FOREWORD


The problem which the Jet Propulsion Laboratory asked the Mathematics 
Clinic at the Claremont Graduate School to consider was twofold in character. 
In essence it consisted of the following: 

(a) given specified mathematical models of the MOSFET device, to
extract, from data supplied by J.P.L., the optimal values of
the model-dependent parameters;

(b) to assess the sensitivity of the several models to variations 
of the parameters from their optimal values. 

This report describes the approach used, and the conclusions reached, by the 
Clinic in tackling these two questions. In the event, we confined ourselves 
to just three MOSFET models, all one-dimensional, and in one of which dif- 
fusion (as well as convection) currents are taken into account. Although 
we feel that significant progress has been made as regards the tasks (a) 
and (b) it is also our view that much still remains to be done for a fully 
comprehensive and systematic study. 

It is a pleasure to thank all the individuals involved in the successful
operation of the Clinic: the student members of the team, for their
perseverance when difficulties, often mystifying, occurred; Mike Robkin,
second-year Harvey Mudd College student, who was employed by the Clinic to
carry out the bulk of the computing; Professor Mario Martelli, the Faculty
Consultant in the second semester, for his interest, inspiration and
enthusiasm; Professor Hedley Morris, who visited for a short time and whose
1984 Summer Clinic Report (in conjunction with Richard Everson) served as
the basis of the work of our Clinic; Professor Ellis Cumberbatch, who
organised the creation of the Clinic; Joy Marshall, the Mathematics
Clinic secretary, for patient typing of unfamiliar, often indecipherable
and seemingly endless mathematics; and last, but not least, Cesar Pina,
the liaison link with J.P.L., for his consistent help throughout the year
and constant interest in the progress of the work.

A listing of all the programs and subroutines used by the Clinic 
has been produced as a supplement to this report and is available, upon 
request, to 

Mathematics Clinic 
Claremont Graduate School 
Claremont, California 91711. 


Chapter 1    Introduction

Chapter 2    Mathematical Models
             (a) Ihantola model
             (b) Spice 2 model
             (c) Brews model

Chapter 3    Optimization Methods

Chapter 4    Results
             (a) Ihantola model
             (b) Spice 2 model
             (c) Brews model

Chapter 5    Conclusion and Discussion
             (a) General overview
             (b) Sensitivity
             (c) Suggestions for future work

References

Graphs


Chapter 1 INTRODUCTION

The device known as the metal-oxide-semiconductor field-effect transistor,
or MOSFET, is described in detail in many places (see e.g. Sze (1981),
Morris & Everson (1984)). Briefly, it consists of doped semiconducting
material (silicon) to which are connected four terminals (see
Figure 1) at the source, drain, bulk substrate and gate. The gate is
separated from the main body of the device by a layer of non-conducting
material such as silicon dioxide. The silicon has a doping profile,
which means that it has been implanted within its crystal structure with
impurity atoms of other elements. In this way we suppose that it has
been made 'p-type' (by implanting, for example, with boron) in the bulk
material substrate and lightly 'n-type' (doped, for example, with
phosphorus) in the regions near the source and drain. Then under a
sufficiently positive voltage (relative to the source) applied at the gate,
an n-type inversion channel will be created in the silicon along which a
drain current $I_D$ flows when the drain voltage is sufficiently positive,
i.e. above some threshold value with respect to the source. The manner
in which $I_D$ depends on $V_{DS}$ is illustrated in Figure 2, where typical
curves are sketched for given fixed values of $V_{GS}$. Such curves can be
obtained experimentally over a range of MOSFETs of different sizes and
properties with good accuracy. The main features consist of a near-linear
growth of current with $V_{DS}$ in the early stages, during which the
MOSFET acts as a linear amplifier, followed by a rapid change to a
domain in which $I_D$ is nearly constant as $V_{DS}$ increases. Subsequently,
the material breaks down electrically and there is a final increase of
$I_D$ at very large $V_{DS}$ (the avalanche region).

Many attempts have been made in recent years to model mathematically
the physical processes that occur in MOSFET operation. Such models
involve the appropriate Maxwell equation for the electrostatic
potential $\phi$, the methods of statistical mechanics to express
charge densities in terms of $\phi$, the Einstein relations to simplify the
diffusion currents and Gauss's law to formulate an expression for the
current along the channel. This Clinic is closely associated with the
analysis of such models, which suffer from the disadvantage that many of
the physical parameters which enter into their construction cannot be
measured, or even defined, with any degree of certainty. Examples of
these parameters are the length and width of the channel, the mobility
of the carriers within the channel and the degree of doping of the
semiconductor: the small dimensions of MOSFETs render the experimental
determination of such quantities most imprecise. Consequently they must
be deduced in an indirect manner and it is the principal objective of our
report to describe a mathematical and numerical method by which this
process can be carried out.

Thus, the 'fit' of a particular model to given data (as plotted on
a diagram like that drawn in Figure 2) is optimized with respect to the
parameters that the model contains. In this way values are obtained
(or 'extracted') for the unknown parameters present in the model. The
data is provided by JPL and consists of sets of measurements of $I_D$ over
a range of values of $V_{DS}$ for specified values of $V_{GS}$ and the
substrate bias voltage $V_{BS}$. Once reliable parameter values are known
they may be incorporated within circuit simulation programs (see e.g.
Vladimirescu and Liu (1980)) to predict behaviour in circuit design. A
secondary objective is then concerned with the sensitivity of a model with
respect to its parameters. For example, if a particular parameter is changed
from its optimal value by, say, 10%, how do the resulting $I_D$-$V_{DS}$
curves deviate from the optimal one?

The MOSFET models that we examine are all one-dimensional models
which assume that changes take place much more rapidly across the
channel than in directions parallel to it. Thus the expressions for the
current are derived on the basis of the 'slowly-varying channel'
approximation and we should expect the theory to be more accurate for longer,
wider MOSFETs. In fact, the data that we use is for MOSFETs of length
from 1.2 μm to 24 μm and width from 2.5 μm to 24 μm. An interesting
question is to determine the variation in accuracy obtained by the models
over the different sizes of device. Unfortunately, the particular data
sets provided do not permit a fully systematic study of this behaviour for
different lengths of MOSFET of fixed width, and vice versa. However, the
general trend can be ascertained and some comments in this connection are
made in Chapters 4 and 5.

The most straightforward of the models neglects diffusion currents
compared with the drift currents and the derivation is described in Morris
and Everson (1984). There are two variants of this model which we have
tested against the data and they differ only in the form assumed for the
expressions defining the effective mobility and the effective length of
the channel. The details are given in Chapter 2 and we refer to these
models as the Ihantola model and the Spice 2 model. The latter is in a
form which is suitable for use in the Spice circuit simulation program.

Now effects of diffusion currents can be important at small values
of $V_{GS}$ and so we also consider the simplest model which takes such
effects into account. This is a so-called 'charge-sheet' model due to Brews
(1978) and is derived in Chapter 2. We shall refer to this as the Brews
model.

The methods of optimization used in fitting the model to the data are
described in Morris and Everson (1984) (see also Chapter 3 of this
report). Roughly speaking, the objective function is taken to be the sum
of squares of the differences between predicted and measured values of
$I_D$ and this quantity is to be minimized with respect to the parameters of
the model. The nonlinear optimization is carried out using two different
techniques and in practice these operate in tandem. Each uses a
subroutine of the IMSL library available on VAX. Firstly, a program SARAH,
which employs a Gauss-Newton method, is used to obtain a minimum with
respect to some pre-set convergence criterion. The minimum is
constrained to lie within a hyper-rectangle of the parameter space and is
chosen as the deepest arising from a large number of initial 'guesses', so
ensuring as far as possible that the value obtained is a global one.

The process tends to be slow so, secondly, a program MOSES, using the
Levenberg-Marquardt algorithm (essentially a hybrid steepest descent /
Newton scheme), refines this value to a desired (higher) accuracy.

The results obtained by applying the programs SARAH and MOSES to the
three models Ihantola, Spice 2 and Brews are described in detail in
Chapter 4. The chief conclusions indicate that, for Ihantola and Spice 2,
provided the data for $V_{GS} = 2$ is omitted to remove the apparently
important effects of diffusion currents, the accuracy obtained decreases as
the device dimensions decrease, although even for the smallest device
(1.2 μm x 2.5 μm) RMS errors of only a few percent are obtained. The same
is true of the Brews model, with generally a somewhat larger error, but here
there is the important distinction that all $V_{GS}$ values are included.
The RMS errors are generally increased for non-zero values of $V_{BS}$ in the
Ihantola and Spice 2 models: the Brews model was not adapted to non-zero
$V_{BS}$.

The results for the sensitivity of each model on its parameters are 
also given in Chapter 4. It is found that each model contains parameters 
on which it depends rather critically and others to which it is relatively 
insensitive. 

Finally, in Chapter 5, an account is presented of our overall
experience in applying the programs SARAH and MOSES. The performance of
the different models is compared and their strengths and weaknesses are
discussed, together with some of the difficulties that were encountered.
Suggestions for future development and extension of the parameter
extraction technique are offered.



Chapter 2 MATHEMATICAL MODELS 


The Clinic has studied three one-dimensional mathematical models of
the MOSFET, the first two of which are derived directly from that given
by Ihantola and Moll (1964) and discussed in detail in Morris and Everson
(1984). These models neglect diffusion currents and differ only in the
assumed functional form for the effective mobility, $\mu_{\mathrm{eff}}$, and
effective length, $L_{\mathrm{eff}}$, of the device. We refer to these models
as the Ihantola model and the Spice 2 model and we summarize below in (a) and
(b) their relevant formulae. An important aspect of all models is the
presence of a number of parameters, P1, P2, ... etc., which are not known
accurately and, as described in Chapter 3, are to be determined for each
model by optimizing the fit of the model to the available empirical data.
The formulae for the first two models, taken from Morris and Everson (1984),
are as follows.


(a) Ihantola Model 

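$$ I_D = P_5\,\mu_{\mathrm{eff}} \Big\{ \big(V_{GS} - P_1 - P_7 - \tfrac{1}{2}V_{DS}\big)V_{DS} - \tfrac{2}{3}P_2\big[(V_{DS} - V_{BS} + P_1)^{3/2} - (P_1 - V_{BS})^{3/2}\big] \Big\} \qquad (1) $$

provided the drain voltage $V_{DS} < V_{DSAT}$ (the saturation voltage).
When $V_{DS} > V_{DSAT}$

$$ I_D = \frac{I_{DSAT}}{L_{\mathrm{eff}}} \qquad (2) $$

where $I_{DSAT}$ is obtained by substituting

$$ V_{DS} = V_{DSAT} = V_{GS} - P_1 - P_7 + \frac{P_2^{2}}{2}\Big\{ 1 - \Big[ 1 + \frac{4}{P_2^{2}}\,(V_{GS} - P_7 - V_{BS}) \Big]^{1/2} \Big\} \qquad (3) $$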
into (1). The seven parameters defined in the Ihantola model, in terms
of physical quantities, are

$$ P_1 = \frac{2}{\beta}\,\ln\!\Big(\frac{N_A}{n_i}\Big) $$

$$ P_2 = \frac{(2\kappa_s q N_A)^{1/2}}{C_{ox}} $$

P3, P4 = parameters used in defining empirical mobility law, see (4)

$$ P_5 = C_{ox}\,Z / L $$

P6 = parameter used in defining empirical channel length modulation,
see (5)

$$ P_7 = V_{FB} . $$

In these expressions $N_A$ is the p-dopant concentration, $n_i$ the intrinsic
carrier concentration, $\beta^{-1}$ the thermal potential $kT/q$ ($k$ =
Boltzmann's constant, $T$ = temperature, $q$ = electronic charge), $\kappa_s$
the semiconductor permittivity, $C_{ox}$ the oxide capacitance per unit area,
$Z$ and $L$ the width and length of the device respectively and $V_{FB}$ the
flatband voltage.

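The expressions for $\mu_{\mathrm{eff}}$ and $L_{\mathrm{eff}}$ in this model
are chosen to be

$$ \mu_{\mathrm{eff}} = \frac{P_3}{1 + P_4\,(V_{GS} - P_1 - P_7 - P_2 P_1^{1/2})} \qquad (4) $$

$$ L_{\mathrm{eff}} = 1 - P_6\big[ (V_{DS} - V_{DSAT} + P_1 - V_{BS})^{1/2} - (P_1 - V_{BS})^{1/2} \big] \qquad (5) $$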
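The way formulae (1) to (5) fit together can be illustrated by a short
Python sketch. It is not the Clinic's program: it assumes the
Ihantola-Moll forms in which $V_{DSAT}$ is the drain voltage at which (1)
attains its maximum, the mobility law $P_3/[1 + P_4(V_{GS} - V_T)]$ with
threshold $V_T = P_1 + P_7 + P_2 P_1^{1/2}$, and a dimensionless
$L_{\mathrm{eff}}$ dividing the saturation current. The function names and
the parameter tuple P_EXAMPLE (the MOSES row for data set AN12 in Table 1)
are chosen here purely for illustration.

```python
import math

def v_dsat(p, v_gs, v_bs=0.0):
    """Saturation voltage: the V_DS at which the sub-saturation current peaks."""
    p1, p2, p7 = p[0], p[1], p[6]
    root = math.sqrt(1.0 + 4.0 * (v_gs - p7 - v_bs) / p2**2)
    return v_gs - p1 - p7 + 0.5 * p2**2 * (1.0 - root)

def mu_eff(p, v_gs):
    """Assumed empirical mobility law, P3 / (1 + P4 * (V_GS - V_T))."""
    p1, p2, p3, p4, p7 = p[0], p[1], p[2], p[3], p[6]
    return p3 / (1.0 + p4 * (v_gs - p1 - p7 - p2 * math.sqrt(p1)))

def i_subsat(p, v_gs, v_ds, v_bs=0.0):
    """Drain current below saturation (drift term minus bulk-charge term)."""
    p1, p2, p5, p7 = p[0], p[1], p[4], p[6]
    drift = (v_gs - p1 - p7 - 0.5 * v_ds) * v_ds
    bulk = (2.0 / 3.0) * p2 * ((v_ds - v_bs + p1)**1.5 - (p1 - v_bs)**1.5)
    return p5 * mu_eff(p, v_gs) * (drift - bulk)

def drain_current(p, v_gs, v_ds, v_bs=0.0):
    """Piecewise current: sub-saturation form, then I_DSAT / L_eff."""
    p1, p6 = p[0], p[5]
    vsat = v_dsat(p, v_gs, v_bs)
    if v_ds <= vsat:
        return i_subsat(p, v_gs, v_ds, v_bs)
    # Assumed channel length modulation; L_eff = 1 exactly at V_DS = V_DSAT.
    l_eff = 1.0 - p6 * (math.sqrt(v_ds - vsat + p1 - v_bs) - math.sqrt(p1 - v_bs))
    return i_subsat(p, v_gs, vsat, v_bs) / l_eff

# (P1, ..., P7): illustrative values only.
P_EXAMPLE = (0.40, 0.88, 0.65, 0.040, 0.0063, 0.014, -0.19)
```

With this choice the current is continuous at $V_{DS} = V_{DSAT}$ and has
zero slope there when approached from below, which is the property that
fixes (3).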
(b) Spice 2 Model

In this model the current in the subsaturation and saturation regions
of drain voltage is still expressed in the form (1) and (2) with
$V_{DSAT}$ given by (3). However, different and much more complicated
empirical expressions are assumed for $\mu_{\mathrm{eff}}$ and
$L_{\mathrm{eff}}$ in terms of the parameters.

There are nine parameters in this model defined by 


P1 = $V_{FB}$
P2 = $N_A$


P3,P4,P5 = parameters used in defining mobility, see (7) and (8) , 

P6 = length of channel 
P7 = width of channel 

P8 = parameter used in defining channel length modulation, see (9) , 
P9 = (2* s qtyV ox . 

Thus only P1, P3 and P9 appear directly in the list of parameters used in
the Ihantola model. The appropriate expressions used for
$\mu_{\mathrm{eff}}$ and $L_{\mathrm{eff}}$ are


eff 


where 


P3 , 

P3 x V 


CRIT 


P5 


V GS ' V To 

V = 

V CRIT P9 


2qP2 


V GS ' V To < V CRIT 
V GS ‘ V To > V CRIT , 


V Tq » PI + <f> + P9 


2 

J £n 


n i 


(7) 


( 8 ) 


and 
The Spice 2 model has the property that the expressions for the
current and the channel conductance are continuous functions and it is 
in a suitable form for adaptation to the SPICE circuit simulation program. 


(c) Brews Model 


There are a number of charge sheet models for the MOSFET but perhaps 
one of the easiest models to implement is that developed by Brews (1978). 
The principal assumption of the charge sheet model is that the current 
travels in a surface of zero depth at the interface between the semi- 
conductor and the gate insulator. This means the inversion layer is 
assumed to have zero thickness. 

In Brews (1978) the expression

$$ I_D = q Z \mu_{\mathrm{eff}}\, N(y)\, \frac{d\phi_f}{dy} \qquad (10) $$

is derived for the drain current, where $N(y)$ is the carrier density per
unit area, $d\phi_f/dy$ is the average quasi-Fermi level gradient and the
coordinate $y$ measures distance along the channel from source to drain.
Brews next approximates $d\phi_f/dy$ by

$$ \frac{d\phi_f}{dy} = \frac{d\phi_s}{dy} - \frac{1}{\beta}\,\frac{d}{dy}(\ln N) \qquad (11) $$

where $\phi_s(y)$ is the potential along the oxide-silicon interface and it is
the second term on the right hand side which is assumed to take into account
diffusion current effects. Integration of (11) yields

$$ N(y) = N(0)\,\exp\big\{ \beta[\phi_s(y) - \phi_s(0)] - \beta[\phi_f(y) - \phi_f(0)] \big\} . \qquad (12) $$

On eliminating $d\phi_f/dy$ between (10) and (12) we obtain the result

$$ I_D = q Z \mu_{\mathrm{eff}} \Big[ N\,\frac{d\phi_s}{dy} - \frac{1}{\beta}\,\frac{dN}{dy} \Big] . \qquad (13) $$
This gives the carrier density in terms of $\phi_s$ and to determine
the electrostatic potential $\phi$ we have

$$ \nabla^2\phi = \begin{cases} 0 & \text{in gate oxide} \\[4pt] -\dfrac{q}{\kappa_s}\,(p - N_A) & \text{in silicon,} \end{cases} \qquad (14) $$

where $p = n_i e^{-\beta\phi + \beta\phi_f}$ is the hole density in the
semiconductor. The charge sheet model then assumes a boundary condition at
$x = 0$ in the form

$$ \kappa_{ox}\Big[\frac{\partial\phi}{\partial x}\Big]_{x=0_-} - \kappa_s\Big[\frac{\partial\phi}{\partial x}\Big]_{x=0_+} = -qN(y) , \qquad (15) $$

where $x = 0$ is the interface between oxide and silicon, $\kappa_{ox}$ is the
permittivity of the oxide and $x$ is the coordinate measured positive into
the silicon.

Now in the long channel approximation the solution of the Poisson
equation (14) is simplified by assuming $\nabla^2\phi \approx d^2\phi/dx^2$.
Hence, in the silicon, integration of (14) once yields

$$ \frac{1}{2}\Big(\frac{d\phi}{dx}\Big)^{2} = -\frac{q}{\kappa_s}\Big\{ -\frac{n_i}{\beta}\exp(-\beta\phi + \beta\phi_f) - N_A\phi \Big\} + \text{constant} . $$

If the constant is chosen to satisfy $\phi \to 0$, $d\phi/dx \to 0$ as
$x \to \infty$ and exponentially small terms are neglected, we obtain for the
boundary value needed in (15),

$$ \kappa_s\Big[\frac{d\phi}{dx}\Big]_{x=0_+} = -q N_A L_B\,[2(\beta\phi_s - 1)]^{1/2} , $$

where the Debye length $L_B = (\kappa_s/\beta q N_A)^{1/2}$. More simply in
the oxide we find

$$ \kappa_{ox}\Big[\frac{d\phi}{dx}\Big]_{x=0_-} = -C_{ox}\,(V_{GS} - \phi_s) . $$

Hence, the boundary condition (15) becomes

$$ C_{ox}\,(V_{GS} - \phi_s) = q N_A L_B\,[2(\beta\phi_s - 1)]^{1/2} + qN(y) . \qquad (16) $$


Thus we have derived two equations (13) and (16) for $\phi_s(y)$ and $N(y)$.
Brews (1978) suggests a method of eliminating $N(y)$ to obtain a relation
between $I_D$ and $\phi_s$. First equations (13) and (16) are differentiated
and $dN/dy$ eliminated between them. Then $N(y)$ is eliminated from
this result by using (16) again. In this way we obtain the expression

$$ I_D = P_5\,\mu_{\mathrm{eff}} \Big\{ \frac{1}{\beta}(1 + \beta V_{GS})(\phi_{sL} - \phi_{s0}) - \frac{1}{2}\big(\phi_{sL}^{2} - \phi_{s0}^{2}\big) - \frac{2P_2}{3\beta^{3/2}}\big[(\beta\phi_{sL} - 1)^{3/2} - (\beta\phi_{s0} - 1)^{3/2}\big] + \frac{P_2}{\beta^{3/2}}\big[(\beta\phi_{sL} - 1)^{1/2} - (\beta\phi_{s0} - 1)^{1/2}\big] \Big\} $$

where the parameters Pi are defined below and $\phi_{s0} = \phi_s(y=0)$ and
$\phi_{sL} = \phi_s(y=L)$, the source and drain values of the potential
respectively.

This is the expression to be used in the Brews model but in the
optimization process the current is required for given specified values
of the drain and gate voltages. This means that in any model evaluation
we must first compute the appropriate values of $\phi_{s0}$ and $\phi_{sL}$.
We also note that the parameters in the model are expressed in exactly the
same form as in the Ihantola model if the same empirical expressions are
chosen for $\mu_{\mathrm{eff}}$ (and $L_{\mathrm{eff}}$ if required).
Calculation of $\phi_{s0}$. The condition applied by Brews here is that
$\phi_{s0}$ should be obtained from the one-dimensional Poisson equation when
$V_{DS} = 0$. He obtains

$$ V_{GS} - \phi_{s0} = \frac{P_2}{\beta^{1/2}}\big[ \beta\phi_{s0} - 1 + \exp(\beta\phi_{s0} - \beta P_1) \big]^{1/2} , $$

if an exponentially small term is neglected. This is implicit in $\phi_{s0}$
for given $V_{GS}$ but we may write the equation in the alternative form

$$ \beta\phi_{s0} = \beta P_1 - \ln(\beta P_2^{2}) + \ln\Big\{ [\beta(V_{GS} - \phi_{s0})]^{2} \Big/ \big[ 1 + (\beta\phi_{s0} - 1)/\exp(\beta\phi_{s0} - \beta P_1) \big] \Big\} . $$

Then Brews suggests the following iteration scheme,

$$ \phi_{s0}^{(0)} = P_1 - \ln(\beta P_2^{2})/\beta , $$

$$ \phi_{s0}^{(i+1)} = \phi_{s0}^{(0)} + \frac{1}{\beta}\ln\Big\{ [\beta(V_{GS} - \phi_{s0}^{(i)})]^{2} \Big/ \big[ 1 + (\beta\phi_{s0}^{(i)} - 1)/\exp(\beta\phi_{s0}^{(i)} - \beta P_1) \big] \Big\} , \qquad (i \ge 0) . $$

It is this scheme which we use for determining $\phi_{s0} = \phi_{s0}(V_{GS})$.
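The iteration for $\phi_{s0}$ can be sketched in a few lines of Python. The
sketch below is illustrative only: it assumes the implicit relation
$\beta(V_{GS} - \phi_{s0})^2 = P_2^2[\beta\phi_{s0} - 1 + \exp(\beta\phi_{s0} - \beta P_1)]$
and the fixed-point form obtained from it by taking logarithms; the function
name, tolerance and iteration cap are hypothetical choices, not taken from
the Clinic's programs.

```python
import math

def phi_s0(v_gs, p1, p2, beta, tol=1e-12, max_iter=200):
    """Fixed-point iteration for the source surface potential (V_DS = 0)."""
    start = p1 - math.log(beta * p2**2) / beta      # zeroth iterate
    phi = start
    for _ in range(max_iter):
        numerator = (beta * (v_gs - phi))**2
        denominator = 1.0 + (beta * phi - 1.0) / math.exp(beta * phi - beta * p1)
        new_phi = start + math.log(numerator / denominator) / beta
        if abs(new_phi - phi) < tol:
            return new_phi
        phi = new_phi
    raise RuntimeError("phi_s0 iteration did not converge")

# Illustrative values: kT/q = 0.0259 V, P1 = 0.7 V, P2 = 0.5 V^(1/2), V_GS = 3 V.
BETA = 1.0 / 0.0259
PHI_S0_EXAMPLE = phi_s0(3.0, 0.7, 0.5, BETA)
```

In strong inversion the map is a strong contraction, because the logarithm
depends only weakly on $\phi_{s0}$, so a handful of iterations suffices; for
gate voltages near or below threshold the same map contracts far more slowly
and a safeguarded root finder may be a better choice.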

Calculation of $\phi_{sL}$. After some discussion Brews uses the condition

$$ \beta\phi_{sL} = \beta\phi_{s0} + \beta V_{DS} + \ln[N(L)/N(0)] $$

to obtain $\phi_{sL}$, where $N(L)$ and $N(0)$ are obtained from (16). Hence
$\phi_{sL}$ is implicitly defined for specified $V_{DS}$ by the equation

$$ \phi_{sL} = \phi_{s0} + V_{DS} + \frac{1}{\beta}\ln\Big\{ \frac{\beta^{1/2}(V_{GS} - \phi_{sL}) - P_2(\beta\phi_{sL} - 1)^{1/2}}{\beta^{1/2}(V_{GS} - \phi_{s0}) - P_2(\beta\phi_{s0} - 1)^{1/2}} \Big\} . \qquad (17) $$

A straightforward iteration scheme for this equation is

$$ \phi_{sL}^{(0)} = \phi_{s0} + V_{DS} , $$

$$ \phi_{sL}^{(i+1)} = \phi_{s0} + V_{DS} + \frac{1}{\beta}\ln\Big\{ \frac{\beta^{1/2}(V_{GS} - \phi_{sL}^{(i)}) - P_2(\beta\phi_{sL}^{(i)} - 1)^{1/2}}{\beta^{1/2}(V_{GS} - \phi_{s0}) - P_2(\beta\phi_{s0} - 1)^{1/2}} \Big\} , \qquad i \ge 0 , $$

but this fails at large $V_{DS}$ because there the numerator of the log term
in (17) tends to zero (near saturation) and hence the perturbation to
$\phi_{sL}^{(i)}$ becomes large. This prompted an investigation into the
asymptotic form of $\phi_{sL}$ for large $V_{DS}$ and equation (16) shows that
its limiting value $\phi^*$ is obtained from

$$ \beta^{1/2}(V_{GS} - \phi^*) - P_2(\beta\phi^* - 1)^{1/2} = 0 . \qquad (18) $$

More precisely from (17), for large $V_{DS}$,

$$ \beta^{1/2}(V_{GS} - \phi_{sL}) - P_2(\beta\phi_{sL} - 1)^{1/2} = A e^{-\beta V_{DS}} + \ldots , $$

for some $A$ independent of $V_{DS}$. This is easily transformed into a
quadratic equation for $\phi_{sL}$ having an appropriate solution in the form

$$ \phi_{sL} = \phi^* + B e^{-\beta V_{DS}} , $$

where $B$ is known in terms of $A$. Substitution of this expression into
(17), written more conveniently as

$$ \exp[\beta(\phi_{sL} - V_{DS})] = \theta\big[ \beta^{1/2}(V_{GS} - \phi_{sL}) - P_2(\beta\phi_{sL} - 1)^{1/2} \big] , $$

where $\theta = \exp(\beta\phi_{s0}) \big/ \big[ \beta^{1/2}(V_{GS} - \phi_{s0}) - P_2(\beta\phi_{s0} - 1)^{1/2} \big]$,
gives the value of $A$. We obtain finally

$$ \phi_{sL} = \phi^* + \frac{2(\phi^* - V_{GS})\big[ \beta^{1/2}(V_{GS} - \phi_{s0}) - P_2(\beta\phi_{s0} - 1)^{1/2} \big]}{\beta^{1/2}\,(2V_{GS} + P_2^{2} - 2\phi^*)}\,\exp[\beta(\phi^* - \phi_{s0} - V_{DS})] + \ldots \qquad (19) $$

where, from (18),

$$ \phi^* = V_{GS} + \frac{P_2^{2}}{2} - \frac{P_2}{2}\Big[ P_2^{2} + 4\Big(V_{GS} - \frac{1}{\beta}\Big) \Big]^{1/2} $$

and the negative sign is required because, for example, $V_{GS} - \phi^*$ must
be positive from (18).
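The limiting value $\phi^*$ and the leading asymptotic correction can be
checked numerically. The Python sketch below is a hedged illustration, not
the Clinic's code: it assumes that (18) reads
$\beta^{1/2}(V_{GS} - \phi^*) - P_2(\beta\phi^* - 1)^{1/2} = 0$ and that the
correction in (19) is the first-order term obtained by linearizing the same
bracket about $\phi^*$. The helper names and the sample value of $\phi_{s0}$
are hypothetical.

```python
import math

def bracket(phi, v_gs, p2, beta):
    """The quantity proportional to N(y) that appears in (17) and (18)."""
    return math.sqrt(beta) * (v_gs - phi) - p2 * math.sqrt(beta * phi - 1.0)

def phi_star(v_gs, p2, beta):
    """Root of the quadratic implied by (18); the minus sign keeps V_GS - phi* > 0."""
    return v_gs + 0.5 * p2**2 - 0.5 * p2 * math.sqrt(p2**2 + 4.0 * (v_gs - 1.0 / beta))

def phi_sl_asymptotic(v_gs, v_ds, phi_s0, p2, beta):
    """Leading-order large-V_DS form: phi* plus an exponentially small term."""
    ps = phi_star(v_gs, p2, beta)
    source_bracket = bracket(phi_s0, v_gs, p2, beta)
    coeff = (2.0 * (ps - v_gs) * source_bracket
             / (math.sqrt(beta) * (2.0 * v_gs + p2**2 - 2.0 * ps)))
    return ps + coeff * math.exp(beta * (ps - phi_s0 - v_ds))

# Illustrative values only: kT/q = 0.0259 V, P2 = 0.5, V_GS = 3 V, phi_s0 = 0.87 V.
BETA = 1.0 / 0.0259
PHI_STAR_EXAMPLE = phi_star(3.0, 0.5, BETA)
PHI_SL_EXAMPLE = phi_sl_asymptotic(3.0, 1.6, 0.87, 0.5, BETA)
```

The correction is negative, so $\phi_{sL}$ approaches $\phi^*$ from below as
$V_{DS}$ grows, and its size is governed by
$\exp[\beta(\phi^* - \phi_{s0} - V_{DS})]$, the same quantity that the
selection criterion for (19) tests.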

In applying the model the criterion was selected whereby the asymptotic
expression (19) should be used unless $\phi^* - \phi_{s0} - V_{DS}$ is greater
than some fixed tolerance (such as $-3/\beta$, $-4/\beta$, ... etc.) and this
gave satisfactory results. We chose the same form for $\mu_{\mathrm{eff}}$ as
in the Ihantola model. However, the Brews model is valid in sub-threshold and
saturation regions so it was decided not to include any empirical form of
channel modulation. This has the effect of reducing the number of parameters
to six and in the foregoing it has been assumed that P6 = $V_{FB}$. Through an
oversight the parameter $V_{FB}$ (which enters the current through a simple
translation of $V_{GS}$) was omitted in the model except where it enters
$\mu_{\mathrm{eff}}$. Further modifications to remedy this fault and also to
include substrate bias effects ($V_{BS}$ is assumed to be zero in the above
analysis) are proceeding but are not available for this report.



Chapter 3 OPTIMIZATION METHODS


Let $\mathbf{P}$ be the parameter vector associated with the models Ihantola
(I), Spice 2 (S) or Brews (B), so that each component of $\mathbf{P}$ is one
of the parameters of (I), (S) or (B) respectively. We make the convention
that the i-th component of $\mathbf{P}$ corresponds to the i-th parameter as
it appears in whichever of the three models is being fitted. Recall that for
different values of the gate voltage, $V_{Gj}$, we are provided with several
experimentally obtained pairs of values of the source to drain current,
$I_D$, versus the drain voltage $V_{DS}$,

$$ \big( V_{DSi},\; I_{Di}^{Gj} \big) , \qquad i = 1, 2, \ldots, m ; \quad j = 1, 2, \ldots, n . $$

Typically m = 20, n = 4. Therefore to every $\mathbf{P}$ we can associate the
scalar function

$$ F(\mathbf{P}) = \sum_{j=1}^{n}\sum_{i=1}^{m} \big[ I_{Di}^{Gj}(\mathbf{P}) - I_{Di}^{Gj} \big]^{2} $$

where $I_{Di}^{Gj}(\mathbf{P})$ is the model-predicted current at the drain
voltage $V_{DSi}$ corresponding to the gate voltage $V_{Gj}$. Such a scalar
function, F, can be constructed for each device for which $V_{DS}$, $I_D$
values are provided and for each of the three mathematical models of the
device response. Our goal is to estimate $\mathbf{P}$ so that
$F(\mathbf{P})$ is minimized with $\mathbf{P}$ belonging to a set of
physically acceptable vectors. This is recognized as a constrained
nonlinear least squares problem in the components of $\mathbf{P}$.
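As an illustration of how such an objective function is assembled from the
measured curves, a minimal Python sketch follows. It assumes the plain sum
of squared differences described in Chapter 1; the data layout (a dictionary
keyed by gate voltage) and the two-parameter `toy_model` are hypothetical
stand-ins, not one of the three MOSFET models.

```python
def sum_of_squares(model, params, data):
    """F(P): squared differences between predicted and measured currents.

    data maps each gate voltage to a list of (v_ds, i_measured) pairs.
    """
    total = 0.0
    for v_gs, curve in data.items():
        for v_ds, i_measured in curve:
            residual = model(params, v_gs, v_ds) - i_measured
            total += residual * residual
    return total

def toy_model(params, v_gs, v_ds):
    """Hypothetical linear-region stand-in: gain * (V_GS - V_T) * V_DS."""
    gain, v_t = params
    return gain * (v_gs - v_t) * v_ds

# Synthetic 'measurements' for n = 3 gate voltages and m = 5 drain voltages.
TRUE_PARAMS = (0.002, 0.8)
DATA = {
    v_gs: [(0.1 * i, toy_model(TRUE_PARAMS, v_gs, 0.1 * i)) for i in range(1, 6)]
    for v_gs in (3.0, 4.0, 5.0)
}
F_AT_TRUTH = sum_of_squares(toy_model, TRUE_PARAMS, DATA)
F_PERTURBED = sum_of_squares(toy_model, (0.0022, 0.8), DATA)
```

Any of the three device models could be substituted for `toy_model`
provided it accepts the same three arguments.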

Various iterative methods exist to carry out this minimization or
vector-optimization process. One, called Steepest Descent (McCormick (1983)),
searches for a minimum in the direction of the negative gradient of
$F(\mathbf{P})$ (a function of several variables decreases most rapidly along
the direction of the negative gradient vector) and then adjusts for step
length along this vector. This process is repeated for each successive
iteration. Steepest Descent is quite stable, in that convergence is assured;
however, convergence is too slow for practical use. A faster method is
Newton's Method (McCormick (1983)), which relies on the Taylor expansion of
the error function with respect to $\mathbf{P}$. A modification of this,
known as Gauss-Newton, converges rapidly but lacks the stability of Steepest
Descent (i.e., convergence is not guaranteed). The algorithm known as
Levenberg-Marquardt (Levenberg (1944), Marquardt (1963), McCormick (1983)) is
an interpolation between Gauss-Newton and Steepest Descent in that search
direction and step length are modified simultaneously. Because it inherits
stability and rapid convergence from the two previously mentioned methods,
Levenberg-Marquardt is recognized to be one of the most efficient algorithms
available. For a more detailed numerical analysis of the iterative
(algebraic) formulation and convergence considerations of these gradient
methods see Morris and Everson (1984).
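The interpolation between Gauss-Newton and Steepest Descent can be made
concrete with a small, self-contained Python sketch of a damped
least-squares iteration. It is not the IMSL routine used by the Clinic: the
two-parameter curve $a(1 - e^{-bx})$, the Marquardt-style diagonal scaling
and the factor-of-ten damping update are illustrative assumptions.

```python
import math

def residuals(p, xs, ys):
    a, b = p
    return [a * (1.0 - math.exp(-b * x)) - y for x, y in zip(xs, ys)]

def jacobian(p, xs):
    a, b = p
    return [[1.0 - math.exp(-b * x), a * x * math.exp(-b * x)] for x in xs]

def levenberg_marquardt(p, xs, ys, lam=1e-2, max_iter=100, tol=1e-20):
    """Minimise the sum of squared residuals for a two-parameter model."""
    p = list(p)
    cost = sum(r * r for r in residuals(p, xs, ys))
    for _ in range(max_iter):
        r = residuals(p, xs, ys)
        jac = jacobian(p, xs)
        # Gauss-Newton normal equations: (J^T J) s = -J^T r.
        a11 = sum(row[0] * row[0] for row in jac)
        a12 = sum(row[0] * row[1] for row in jac)
        a22 = sum(row[1] * row[1] for row in jac)
        g1 = sum(row[0] * ri for row, ri in zip(jac, r))
        g2 = sum(row[1] * ri for row, ri in zip(jac, r))
        # Damping inflates the diagonal: small lam gives a Gauss-Newton step,
        # large lam gives a short step close to the steepest-descent direction.
        d11, d22 = a11 * (1.0 + lam), a22 * (1.0 + lam)
        det = d11 * d22 - a12 * a12
        s1 = (-g1 * d22 + g2 * a12) / det
        s2 = (-g2 * d11 + g1 * a12) / det
        trial = [p[0] + s1, p[1] + s2]
        trial_cost = sum(t * t for t in residuals(trial, xs, ys))
        if trial_cost < cost:
            improvement = cost - trial_cost
            p, cost = trial, trial_cost
            lam = max(lam / 10.0, 1e-12)   # accepted: trust Gauss-Newton more
            if improvement < tol:
                break
        else:
            lam *= 10.0                    # rejected: lean towards descent
            if lam > 1e12:
                break
    return p, cost

# Synthetic, noise-free data generated with a = 2.0 and b = 0.8.
XS = [0.5 * k for k in range(1, 11)]
YS = [2.0 * (1.0 - math.exp(-0.8 * x)) for x in XS]
FIT, FINAL_COST = levenberg_marquardt([1.0, 1.0], XS, YS)
```

A step is accepted only if it lowers the error function, which is what
gives the method its stability; the damping parameter then shrinks so that
later steps recover the fast Gauss-Newton behaviour near the minimum.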

Also described in detail in that report are the two primary programs,
SARAH and MOSES. They use IMSL subroutines invoking the above described
Gauss-Newton and Levenberg-Marquardt methods to carry out the minimization
procedure. SARAH is given a number of initial values or starting points
from which to search over a prescribed hyper-rectangle, S, of the parameter
space. The selection of the extreme points of the hyper-rectangle is guided
by an a priori estimate of the range of the different parameters. SARAH
sifts through the data employing a constrained Gauss-Newton method to locate
what is hoped to be the global minimum. The multi-dimensional parameter
space is seeded with minima; by sorting through a wide range of data SARAH
locates a number of these minima and then selects the deepest as the global
minimum. This, then, is fed to MOSES, which uses an unconstrained
Levenberg-Marquardt algorithm to improve upon the convergence and accuracy
of the vector $\mathbf{P}$ that minimizes the error function $F(\mathbf{P})$.
It should be emphasized that the convergence to such a vector (within a
specified tolerance) subsequent to the MOSES program does not offer assurance
that this indeed is the global minimum. In fact there is no clear-cut way of
determining this. Moreover, the value of $\mathbf{P}$ provided by MOSES may no
longer belong to the set, S, of acceptable vectors (see Chapter 5 for a more
extensive discussion of this issue). However, since MOSES improves upon the
constrained minimization provided by SARAH, the minimum obtained, according
to the specified criteria of tolerance, usually lies within S or is
sufficiently close to it.
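The strategy of seeding a bounded region with starting points and keeping
the deepest minimum found can be sketched in Python as follows. This is a
one-dimensional illustration under stated assumptions, not a description of
SARAH itself: the double-well test function, the clipped gradient-descent
local search (standing in for the constrained Gauss-Newton step) and the
names `local_search` and `multistart` are all hypothetical.

```python
import random

def f(x):
    """Double-well test function with a deeper minimum near x = -1.04."""
    return (x * x - 1.0)**2 + 0.3 * x

def grad(x):
    return 4.0 * x * (x * x - 1.0) + 0.3

def local_search(x, lo, hi, step=0.01, iters=5000):
    """Gradient descent clipped to the interval [lo, hi]."""
    for _ in range(iters):
        x = min(hi, max(lo, x - step * grad(x)))
    return x

def multistart(n_starts, lo, hi, seed=0):
    """Run the local search from random starts and keep the deepest minimum."""
    rng = random.Random(seed)
    minima = [local_search(rng.uniform(lo, hi), lo, hi) for _ in range(n_starts)]
    return min(minima, key=f)

BEST = multistart(20, -2.0, 2.0)
SINGLE_START = local_search(1.5, -2.0, 2.0)   # lands in the shallower well
```

A single start at x = 1.5 settles in the shallower well, whereas twenty
starts also find the deeper one. More starting points raise the chance of
reaching the global minimum but cannot guarantee it, which is the caveat
made above.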

Program MOSES stops its search whenever one of the following tolerance
criteria is satisfied:

(a) On two successive iterations the parameter estimates
agree, component by component, to within a specified
number of significant digits.

(b) The norm of the gradient vector is within a specified
tolerance.

(c) On two successive iterations the error function $F(\mathbf{P})$
differs by less than some prescribed small amount $\epsilon$, i.e.,

$$ \big| F(\mathbf{P}_{k+1}) - F(\mathbf{P}_k) \big| \le \epsilon . $$
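The three stopping tests can be written as a single predicate. The Python
sketch below is an illustration only: the function name, the relative
interpretation of "significant digits" and the default tolerances are
assumptions, and the actual IMSL routine may define these tests differently.

```python
import math

def should_stop(p_old, p_new, gradient, f_old, f_new,
                nsig=3, grad_tol=1e-8, eps=1e-10):
    """Return True if any of the criteria (a), (b) or (c) is met."""
    # (a) successive estimates agree to nsig significant digits, per component
    digits_agree = all(
        abs(a - b) <= 10.0**(-nsig) * max(abs(a), abs(b))
        for a, b in zip(p_old, p_new)
    )
    # (b) Euclidean norm of the gradient within tolerance
    gradient_small = math.sqrt(sum(g * g for g in gradient)) <= grad_tol
    # (c) change in the error function within eps
    function_flat = abs(f_new - f_old) <= eps
    return digits_agree or gradient_small or function_flat
```

For example, `should_stop([1.0, 2.0], [1.0004, 2.0003], [0.1, 0.1], 5.0, 4.0)`
is true through criterion (a) alone, since both components agree to three
significant digits even though the gradient and the error function are
still changing.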


The size of the Root Mean Square (RMS) error expressed in percentage
terms is chosen as the main accuracy criterion for deciding whether or not
a given value of $\mathbf{P}$ is acceptable. Recall that the RMS error is a
quantification of accuracy based upon an averaging of error distributed over
all data points, say N. The formula for RMS error is given by:


RMS ERR * 
I


Dj 



Less than ten percent RMS error from either SARAH or MOSES for a given 
device is regarded as a satisfactory result (RMS MOSES < RMS SARAH). 



Chapter 4 RESULTS


This chapter is divided into three sections giving the main results for 
the three models tested, Ihantola, Spice 2 and Brews respectively. Compari- 
sons between the models are presented and discussed in Chapter 5. 

(a) IHANTOLA MODEL 

The Ihantola model provides accurate results, at least when $V_G \ne 2$
and $V_{BS} = 0$. The accuracy remains surprisingly good even for small
MOSFETs. Table 1 gives the relevant data concerning the Ihantola model.

As anticipated, the RMS error corresponding to devices of greater
channel length and width is substantially less than that associated with
smaller devices. That is to say, the accuracy of the mathematical model's
fit to the data (to currents produced experimentally), provided by
extraction of the appropriate estimated parameter vector $\mathbf{P}$,
generally increases as the channel's length and width increase. Table 2,
extracted from Table 1, confirms this result.

Notice that the best fit is usually provided when length/width = 1.



TABLE 1

RMS ERROR / PARAMETER VALUES FOR IHANTOLA MODEL

                                      P A R A M E T E R S
DATA   DIM (μm)             P1     P2     P3     P4      P5      P6      P7      RMS
SET                                                                              ERR%

101    1.2 x 2.5     (S)    .40    1.3    1.1    .50     .014    .051    -.40    5.7
                     (M)    1.2    3.4    1.1    .11     .0065   .083    -5.0    3.9

102    2.5 x 2.5     (S)    .40    1.1    .61    .17     .0080   .044    -.40    4.6
                     (M)    .95    1.7    .61    .083    .0060   .054    -2.1    4.5

103    1.2 x 5.0     (S)    .40    1.2    1.2    .50     .036    .049    -.40    7.1
                     (M)    1.5    3.7    1.2    .11     .017    .087    -6.0    4.9

104    2.5 x 5.0     (S)    .40    1.1    .75    .22     .020    .044    -.40    4.7
                     (M)    2.0    2.0    .74    .093    .014    .066    -4.4    4.6

N12    13.5 x 13.5   (S)    .45    .81    .93    .036    .0049   .021    .39     0.25
                     (M)    .59    .84    .93    .036    .0049   .023    .63     0.22

N13    13.5 x 4.5    (S)    .40    .87    .67    .033    .0018   .022    -.33    0.30
                     (M)    .36    .86    .67    .033    .0018   .021    -.26    0.31

AN11   3.0 x 3.0     (S)    .40    1.3    1.1    .18     .0062   .065    -.40    3.2
                     (M)    1.5    2.3    1.1    .066    .0043   .090    -3.7    1.9

AN12   24.0 x 24.0   (S)    .41    .88    .65    .040    .0063   .014    -.21    0.24
                     (M)    .40    .88    .65    .040    .0063   .014    -.19    0.24

AN13   3.0 x 24.0    (S)    .40    1.2    .64    .22     .087    .071    -.40    3.2
                     (M)    1.3    2.1    .62    .086    .061    .094    -3.2    1.9

AN21   2.5 x 2.5     (S)    .40    1.1    .60    .17     .0083   .045    -.40    2.8
                     (M)    1.5    1.9    .60    .078    .0061   .062    -3.2    1.7

AN22   2.5 x 5.0     (S)    .40    1.1    .78    .22     .020    .045    -.40    3.2
                     (M)    2.1    2.1    .78    .092    .014    .069    -4.6    1.8

AN23   2.5 x 10.0    (S)    .40    .93    .60    .19     .046    .041    -.40    2.1
                     (M)    1.0    1.3    .60    .13     .039    .052    -1.9    1.5

(S) = SARAH, (M) = MOSES.
All data for VBS = 0, VG ≠ 2.
For SARAH: NSIG = 2, NSRCH = 100.
For MOSES: NSIG = 3.






TABLE 2

DATA SET    CH L/CH W     RMS SARAH/RMS MOSES

101         1.2/2.5       5.7/3.9
102         2.5/2.5       4.6/4.5
AN21        2.5/2.5       2.8/1.7
AN23        2.5/10        2.1/1.5
N13         13.5/4.5      0.30/0.31
N12         13.5/13.5     0.25/0.22
AN12        24/24         0.24/0.24


The above results neglect $V_G = 2$, which, if included, produces
unacceptable errors (90% RMS or greater). Both non-diffusive models,
Ihantola and Spice 2, omit this gate voltage in the scheme of extracting
parameters (see Chapter 5 for further discussion).

Another factor influencing accuracy involves the number of initial
values or starting guesses. The SARAH program prompts for specification
of the number of initial values and then distributes these (internally to
the program) as starting points with which to commence the search for minima
over the hyper-rectangle of seven-dimensional parameter space. It would
seem intuitively clear that a larger field of initial starting points
would produce greater accuracy in obtaining an optimal parameter estimate;
certainly an increase in the number of starting guesses improves the
probability of locating the global minimum. And whether or not the global
minimum is in fact the minimum SARAH ultimately calculates is not
particularly relevant. SARAH's process of sifting through the
seven-dimensional parameter space in search of a number of minima, then
selecting the deepest as the global minimum, would indicate that a
larger assortment of starting values provides greater flexibility in the
choice of this deepest minimum which is eventually handed over to the
MOSES program. SARAH, however, consumes a great deal of time in the
process of computing the optimal parameter estimate. Our initial selection
of just one starting value necessitated allowing for large blocks of
turnaround computing time. To expedite this process, data sets for devices
of varying channel length and width were run continuously in batch mode.
This avoided having to input each set individually and wait for long
periods of time before running the next set. In this fashion we were able
to obtain the more accurate results from SARAH using up to 100 initial
values.

A sensitivity analysis was performed with respect to each of the
seven parameters of the Ihantola model. If we view our parameter extraction
process as essentially that of best-fitting response curves to experimental
data, then it is prudent to consider the following situation.
Suppose we have obtained the parameter vector P that minimizes the
error function F(P) and therefore gives a best fit of the response curve to
the data. In what way does a small perturbation of just one of the
parameters (Pi, the ith component of the parameter vector P, 1 ≤ i ≤ 7)
affect the accuracy of the response curve's fit to the data?
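The one-at-a-time perturbation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_model` is a hypothetical two-parameter response curve (not the Ihantola model), the data are synthetic, and the RMS percentage error is assumed to be the root mean square of the relative residuals, which may differ in detail from the measure computed by SARAH and MOSES.

```python
import math

def rms_percent_error(model, params, data):
    """RMS of the relative residuals, in percent (assumed error measure)."""
    total = 0.0
    for x, y in data:
        total += ((model(x, params) - y) / y) ** 2
    return 100.0 * math.sqrt(total / len(data))

def one_at_a_time(model, p_opt, data, frac=0.10):
    """Scale each parameter by (1 - frac) and (1 + frac), others held fixed."""
    table = []
    for i in range(len(p_opt)):
        row = []
        for mult in (1.0 - frac, 1.0 + frac):
            p = list(p_opt)
            p[i] *= mult
            row.append(rms_percent_error(model, p, data))
        table.append(tuple(row))
    return table

# Hypothetical response curve, NOT one of the MOSFET models in this report.
def toy_model(x, p):
    return p[0] * x * x + p[1]

p_opt = [2.0, 0.01]
# Synthetic data generated from the model itself, so the optimal fit is exact.
data = [(x, toy_model(x, p_opt)) for x in (1.0, 2.0, 3.0, 4.0)]

sensitivity = one_at_a_time(toy_model, p_opt, data)
for i, (low, high) in enumerate(sensitivity, start=1):
    print(f"P{i}: x0.9 -> {low:.3f}% RMS, x1.1 -> {high:.3f}% RMS")
```

In this toy case the first parameter dominates the curve, so its ±10% perturbation produces a much larger RMS error than the same perturbation of the second parameter; the latter plays the role of an insensitive parameter.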

A graphical analysis provides visual insight into the way in which
a ten percent increment and decrement alter the fit of these response
curves. It is found that a ±10% perturbation of the first parameter,
the doping parameter, produces a rather significant change of the
curve (from the curve with optimal parameter estimates) for gate voltages
of 3, 4 and 5. In this respect we regard P1 as quite sensitive. In
contrast, we find that the same perturbation of the sixth parameter,
P6, results in an insignificant change of the response curves,
indicative of great insensitivity.

In Figures 3-6 graphs are drawn which provide an illustration of the
model's sensitivity (for two particular data sets at zero substrate bias)
to small perturbations of the seven parameter components. For the data
set corresponding to the long-channel device (24 x 24 µm), P3 and P5 are
found to be the most sensitive; a 10% change (increase and decrease) in
the value of these parameters produces close to 10% RMS error of the model
to experimental data (RMS error for the optimal parameter set was 0.24%).
The effect of such a change in P3 is illustrated in Figure 3. The same
perturbation of P6, illustrated in Figure 4 and seen as the least sensitive
of the parameters, produced 1.25% RMS error. Similarly, for data set AN21
(as an example of a shorter channel device, 2.5 x 2.5 µm), a 10% increase
and decrease of P7 result in 15.5% and 17.3% RMS error, a significant
deviation from the model's best fit of 1.7% RMS error.

Again, P6 is seen to be the least sensitive of the seven parameters.
A 10% perturbation of this parameter produces a 1.8% RMS error of the model
to experimental data - less than a 0.01% change from the model's best-fit.
The behavior with respect to these two parameters is depicted in Figures 5
and 6.

Eigenvalues of the Hessian matrix provide further information with
regard to the sensitivity analysis. However, MOSES failed to output
eigenvalues consistently, for reasons presently not understood. This
question of sensitivity to (or redundancy of) a given component of P
deserves further investigation. Future analysis should aim to provide a
complete picture of this important issue (see Chapter 5 for additional
comments).

Table 2 also compares parameter components, P1 - P7 (with listed
RMS percentage error), from the optimization of the twelve data sets tested.
Notice that the parameter values for data sets 102 and AN21, two different
experimental sets of data provided for devices of the same dimensions (2.5
x 2.5 µm), closely coincide. Similarly, parameter values for data sets
104 and AN22 (both 2.5 x 5.0 µm) are roughly equivalent (although the
MOSES estimate of the vector P for data sets 102 and 104 has more than
twice the RMS error of the corresponding parameter estimates for AN21
and AN22 respectively). Examination of the parameter component pertaining
to the capacitance, P5, reveals that an increase of the device's width (i.e.
equal length, varying width) is accompanied by a roughly proportional
increase in the value of this particular parameter. For example, data sets
101 and 103, of dimension 1.2 x 2.5 µm and 1.2 x 5.0 µm respectively,
correspond to P5 values of .0065 and .017. Data sets 102 and 104 (2.5 x 2.5 µm
and 2.5 x 5.0 µm) have P5 values of .0060 and .014. Similarly, for
AN21 - AN23 (2.5 x 2.5 µm, 2.5 x 5.0 µm and 2.5 x 10.0 µm) the values
produced are .0061, .014 and .039. For N13 and N12 (13.5 x 4.5 µm and
13.5 x 13.5 µm) P5 is .0018 and .0049: a three-fold increase in the
device's width produces a similar increase in the parameter value
(consistent with what we might expect on physical grounds).
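The proportionality claim can be checked directly from the P5 values quoted in the paragraph above. The short Python sketch below only rearranges those quoted numbers; the grouping labels and the helper name `ratios` are introduced here for illustration.

```python
# (width in microns, P5) pairs as quoted in the text, grouped by channel length
groups = {
    "101/103 (L = 1.2)": [(2.5, 0.0065), (5.0, 0.017)],
    "102/104 (L = 2.5)": [(2.5, 0.0060), (5.0, 0.014)],
    "AN21/AN22/AN23 (L = 2.5)": [(2.5, 0.0061), (5.0, 0.014), (10.0, 0.039)],
    "N13/N12 (L = 13.5)": [(4.5, 0.0018), (13.5, 0.0049)],
}

def ratios(points):
    """Return (width ratio, P5 ratio) for consecutive devices in a group."""
    out = []
    for (w0, p0), (w1, p1) in zip(points, points[1:]):
        out.append((w1 / w0, p1 / p0))
    return out

all_ratios = []
for name, points in groups.items():
    for w_ratio, p_ratio in ratios(points):
        all_ratios.append((w_ratio, p_ratio))
        print(f"{name}: width x{w_ratio:.1f} -> P5 x{p_ratio:.2f}")
```

Each doubling of the width multiplies P5 by a factor between about 2.3 and 2.8, and the three-fold increase from N13 to N12 multiplies it by about 2.7, so the dependence is roughly, though not exactly, proportional.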

Finally, we note that the greatest accuracy of parameter estimation
(i.e. the least RMS error) occurs when V_BS = 0. Increasingly negative
V_BS values yield greater error and diminished accuracy. Table 3
itemizes the Ihantola RMS error for various V_BS.

TABLE 3

RMS percentage errors for the Ihantola model for each data set and
different values of V_BS (V_G ≠ 2)

DATA SET   V_BS = 0 (M)   V_BS = -2 (S)   V_BS = -2.5 (S)   V_BS = -5 (S)

101            3.9
102            4.5
103            4.9
104            4.6
AN11           1.9            4.2                               13
AN12            .24            .82                              11
AN13           1.9            3.9                                7.5
AN21           1.7            3.5                                7.6
AN22           1.8            4.1                                6.5
AN23           1.5            2.3                                4.0
N12             .22            .68                               3.3
N13             .31            .92                               5.4
121                                            9.4
122                                            5.2
123                                           11.0
124                                            5.6

TABLE 4

RMS ERROR/PARAMETER VALUES FOR SPICE 2 MODEL
All data for V_BS = 0, V_G ≠ 2
For SARAH: NSIG = 2, NSRCH = 100
For MOSES: NSIG = 3

DATA   DIM (µm)          PARAMETERS                                                     RMS
       Length x Width    P1     P2     P3     P4     P5     P6     P7     P8     P9     ERROR (%)

101    1.2 x 2.5    (S)  -1.9   3.5    1.2    0.46   0.50   0.97   3.0    8.6    2.0    5.2
                    (M)  -2.1   2.8    1.2    0.40   0.52   0.80   3.0    14     2.2    4.8
102    2.5 x 2.5    (S)  -1.9   4.9    0.81   1.0    0.47   2.1    2.7    8.6    1.7    4.8
                    (M)  -1.9   4.1    0.80   0.95   0.55   1.8    2.7    56     1.7    4.8
103    1.2 x 5      (S)  -1.4   24     0.83   1.5    0.5    1.4    4.3    1.5    1.4    8.4
                    (M)  -1.3   26     0.91   1.4    0.73   1.3    4.6    1.5    1.5    6.9
104    2.5 x 5      (S)  -1.6   7.6    1.1    0.94   0.25   2.1    4.8    1.5    1.5    4.9
                    (M)  -1.6   7.5    1.1    1.0    0.29   2.1    4.8    2.3    1.5    4.9
N12    13.5 x 13.5  (S)  -0.78  0.27   0.63   0.78   0.13   16     11     1.5    0.8    0.67
                    (M)  -0.78  0.27   0.63   0.78   0.13   16     11     1.5    0.8    0.67
N13    13.5 x 4.5   (S)  -0.93  3.5    1.2    1.1    0.064  15     3.8    1.5    1.0    0.78
                    (M)  -0.93  3.5    1.2    1.1    0.064  15     3.8    1.5    1.0    0.78
AN11   3 x 3        (S)  -1.5   29     1.0    0.37   0.24   3.4    3.1    1.5    1.5    3.3
                    (M)  -1.5   29     1.0    0.37   0.24   3.4    3.1    1.5    1.5    3.3
AN12   24 x 24      (S)  -1.1   18     0.67   3.3    0.11   28     20     1.5    1.1    1.1
                    (M)  -1.1   18     0.67   3.3    0.11   28     20     1.5    1.1    1.1
AN13   3 x 24       (S)  -1.5   1.5    1.2    8.6    0.43   2.4    29     1.5    1.4    4.6
                    (M)  -1.6   1.5    1.3    8.6    0.43   2.4    29     1.5    1.5    4.3
AN21   2.5 x 2.5    (S)  -1.9   34     0.45   1.7    0.16   2.2    2.1    1.5    1.6    2.4
                    (M)  -1.9   34     0.45   1.7    0.16   2.2    2.1    1.5    1.6    2.4
AN22   2.5 x 5      (S)  -2.0   28     0.51   2.0    0.26   2.3    5.9    1.5    1.6    1.9
                    (M)  -2.0   28     0.51   2.0    0.26   2.3    5.9    1.5    1.6    1.9
AN23   2.5 x 10     (S)  -2.0   3.2    1.1    0.83   0.20   2.0    12     8.6    1.6    1.1
                    (M)  -1.9   2.0    1.1    0.72   0.33   1.6    12     77     1.5    0.92


In common with the Ihantola model it was found that large errors arise
from incorporating the V_G = 2 data. Hence the RMS errors quoted in this
table all refer to cases with V_G = 2 excluded and with V_BS = 0. The
general pattern of results follows that for the Ihantola model described
in the previous section. The results for long MOSFETs generally show small
RMS errors, and these increase to a maximum for the shorter MOSFETs. A
similar trend with regard to the width of the device can be seen from data sets
AN21, AN22 and AN23, where for a fixed length (2.5 µm) the RMS error decreases as
the width increases (see Table 5).

TABLE 5

Variation of RMS percentage error with width for fixed length
L = 2.5 µm, V_BS = 0

DATA     W (µm)    RMS (%)

AN21      2.5      2.4
AN22      5.0      1.9
AN23     10.0      0.92


The Spice 2 model contains nine parameters as described in Chapter 2.
Included in Table 4 are the values of these parameters, extracted to two
significant figures, by programs SARAH and MOSES from the different data
sets. There are two pairs of data sets relating to different devices of
the same dimensions, namely AN22 and 104 (2.5 x 5.0 µm) and AN21 and 102
(2.5 x 2.5 µm), but the results show that different RMS errors are obtained
for each member of a pair. Moreover, there can be wide variations
in the optimal values of the parameters. In this respect P2 and P8
are particularly noteworthy examples. Now P2 is the doping parameter
defined by (6) and a wide variation over different devices is understandable.
However, P8 is a parameter (V_max) representing the maximum
drift velocity of the carriers and contributes to the channel length
shortening (Vladimirescu and Liu (1980)). The fact that it takes such
widely differing values for similar devices would appear to point to a
shortcoming of the model.

The parameters P6 and P7 , representing the length and width of 
the channel respectively, were, on physical grounds, constrained to lie 
within ±20% of the actual device dimensions during operation of the program 
SARAH. As shown in Table 4 the unconstrained optimal values predicted by 
MOSES are generally close to the SARAH values. 

An important parameter is P1, the flatband voltage V_FB, and in
all cases this turns out to have a negative value (for V_BS = 0) of
magnitude O(1). In some cases run with V_BS = -5 a positive V_FB
results.

The effect of changing V BS from zero to different negative values 
is summarized in Table 6 (analogous to Table 3 for the Ihantola model), 
where RMS percentage errors are quoted from running the SARAH and MOSES 
programs with each of the data sets. The same general trend is observed 
as before, that the fit of the model to the data becomes less accurate 
as |V_BS| is increased.

The sensitivity analysis with respect to the parameters, described in
part (a) for the Ihantola model, was also carried out for the Spice 2
model for the same two data sets, for a 'large' MOSFET (24 x 24 µm) and a
'small' MOSFET (2.5 x 2.5 µm). The results are consistent in the sense
that the most sensitive and the least sensitive parameters are the same
for each size of MOSFET. A sensitive parameter is P9 (corresponding to
the parameter P2 of the Ihantola model) and this is illustrated for both
data sets in Figures 7 and 8. By contrast the model is insensitive to
the parameter P4, occurring in the mobility law, and the associated
graphs are drawn in Figures 9 and 10.


TABLE 6

RMS percentage errors for the Spice 2 model for each data set and
different values of V_BS (V_G ≠ 2).

DATA SET   V_BS = 0   V_BS = -2   V_BS = -2.5   V_BS = -5

101          4.8
102          4.8
103          6.9
104          4.9
AN11         3.3        4.3                       16
AN12         1.1        1.6                       17
AN13         4.3        4.0                       11
AN21         2.4        4.5                       11
AN22         1.9        3.8                        6.6
AN23         0.92       7.1                        6.2
N12          0.67       1.3                        4.0
N13          0.78       1.5                        5.8(S)
121                                  7.0
122                                  5.1
123                                  9.4(S)
124                                  6.1



(C) BREWS MODEL 

The principal objective in setting up the Brews model is to create a
model that includes the effects of diffusion current, so that the model can
predict drain current for low values of gate voltage. The data with which
we worked had four gate voltages, viz. 2, 3, 4 and 5. With the Ihantola
and Spice 2 models the very poor fit to the V_GS = 2 data prompted us to
exclude these data when trying to fit the model to the data. When
this is done a fairly low RMS error results. The Brews model is much
better than Ihantola or Spice 2 at finding a fit to data that include
V_GS = 2, but in general it does not result in lower RMS error values than
Ihantola and Spice 2 when V_GS = 2 is excluded.

As might be expected, the Brews model takes noticeably longer in terms
of CPU time spent in calculating the optimal parameter set. This is due
primarily to the iteration schemes used to calculate the voltages at the
end points of the charge sheet. Accurate figures cannot be given, as no
software was available at the time of this work to measure actual CPU time.

A summary of the results obtained from the Brews model is given in
Table 7, in analogy with Table 1 and Table 4 for the other models. From
Table 7 one feature stands out rather strongly, namely that frequently
MOSES gives little or no improvement on the SARAH results. In fact in
two cases (101, AN13) the RMS error actually increased. This situation
occurred also from time to time in the development phase of the other
models, where it was explained in terms of an inconsistent use of the
library routine parameter NSIG (the number of significant figures to
which the model parameters must agree on successive iterations in order
to stop the program), but in this case NSIG=2 for SARAH and NSIG=3 for MOSES.
Hence it is expected that the RMS error in MOSES should generally be less
than that in SARAH. Further numerical experimentation to investigate
this phenomenon is indicated.



TABLE 7

RMS ERROR/PARAMETER VALUES FOR BREWS MODEL

DATA
SET    DIM (µm)            P1     P2     P3     P4     P5      P6     RMS ERR%

101    1.2 x 2.5     (S)   .12    1.2    .72    .49    .022    .49    7.8
                     (M)   .072   1.3    .72    .48    .022    .47    8.5
102    2.5 x 2.5     (S)   .35    .78    .69    .40    .012    -.15   6.1
                     (M)   .17    .91    .70    .21    .0084   -.39   5.4
103    1.2 x 5.0     (S)   .11    1.2    1.1    .48    .035    1.1    8.9
                     (M)   .044   1.2    1.1    .43    .034    .98    8.9
104    2.5 x 5.0     (S)   .15    .89    .80    .25    .022    -.26   5.5
                     (M)   .15    .89    .80    .25    .022    -.26   5.5
N12    13.5 x 13.5   (S)   .17    .77    .60    .062   .0086   1.3    2.5
                     (M)   .17    .77    .60    .062   .0086   1.3    2.5
N13    13.5 x 4.5    (S)   .11    1.2    .61    .47    .0071   -.40   2.4
                     (M)   .22    .70    .60    .060   .0025   -.96   0.94
AN11   3.0 x 3.0     (S)   .15    1.1    .79    .18    .0092   -.34   5.6
                     (M)   .15    1.1    .79    .18    .0093   -.36   5.4
AN12   24.0 x 24.0   (S)   .28    .76    .82    .060   .0057   -.25   0.56
                     (M)   .28    .76    .82    .060   .0057   -.25   0.56
AN13   3.0 x 24.0    (S)   .21    .90    1.2    .19    .037    1.3    5.3
                     (M)   .17    .93    1.2    .17    .036    1.3    5.5
AN21   2.5 x 2.5     (S)   .34    .78    1.3    .48    .0073   -.34   5.7
                     (M)   .18    .92    1.3    .25    .0051   -.73   4.6
AN22   2.5 x 5.0     (S)   .15    .90    1.1    .26    .017    -.40   4.9
                     (M)   .15    .90    1.1    .27    .017    -.40   4.9
AN23   2.5 x 10.0    (S)   .21    .69    1.1    .22    .025    .74    3.6
                     (M)   .21    .69    1.1    .22    .025    .74    3.6

All data for V_BS = 0, V_G = 2 included
FOR SARAH: NSIG = 2, NSRCH = 100
FOR MOSES: NSIG = 3


We note from Table 7 that data sets for MOSFETs of the same size
have approximately the same RMS error. For example, the data sets 102
and AN21 (2.5 x 2.5 µm) have RMS errors of 5.4% and 4.6% respectively.
Similar results hold for the 104 and AN22 data sets (2.5 x 5 µm).

Further, note that as the channel width increases for fixed length (data
sets AN21, AN22, AN23) the error decreases overall (4.6%, 4.9% and 3.6%
respectively), being smallest for the widest device. This
is to be expected, as the increasing width would make the one-dimensional
model assumption used in the Brews model more valid.

A sensitivity analysis was performed using the AN12 and AN21 data sets. 
Table 8 gives the RMS percentage errors which result from a ±10% change in 
each of the optimal parameters in turn. 


TABLE 8

RMS percentage errors for parameter variation of ±10% from the optimum

                 AN12 (RMS=0.56%)       AN21 (RMS=4.6%)
MULTIPLIER        0.9       1.1          0.9       1.1

P1                4.06      3.56         6.05      4.29
P2                6.76      6.04         9.51      6.62
P3                9.92     10.13         9.42     12.76
P4                1.77      1.62         7.84      5.39
P5                9.92     10.13         9.42     12.76
P6                0.60      0.55         5.23      4.38


In Table 8 it can be seen that the most sensitive parameters are P3 and P5,
but it is informative to note that the changed RMS errors are identical
for P3 and P5. The parameter showing the least amount of change is P6,
and again we note that increasing the value of P6 results in an improved
RMS error in both the AN12 and AN21 data sets. This is the only place
where an improvement of fit results from a perturbed parameter.

Again, as with the other models, the sensitivity results are depicted
graphically. For each data set quoted in Table 8 the fit for ±10%
variations in the most sensitive (P5) and least sensitive (P6) parameters
is displayed in Figures 11-14.

As mentioned in Chapter 2 (c), the Brews model used here has not been
adapted to include V_BS ≠ 0. Further, a correction to include flat band
effects more fully should be incorporated into the model.



Chapter 5 CONCLUSION AND DISCUSSION 

Three topics are presented in this last chapter. 

(a) An overview of the three models used by the team to extract the 
relevant parameters of a MOSFET; 

(b) Relevant features of the sensitivity problem and trends of all 
models with respect to it; 

(c) Suggested lines of future inquiry on the basis of the team's
knowledge of the behaviour of the devices, of the limitations
of the proposed models and of the complexity of the required
numerical investigations.

(a) General Overview 

The Ihantola model (as implemented by the SARAH and MOSES programs)
provides a more accurate estimation of the parameter vector P than
Spice 2. Comparing long-channel devices, we note that corresponding to
dimensions of 24 x 24 µm, 13.5 x 13.5 µm and 13.5 x 4.5 µm the RMS
Ihantola MOSES errors at V_BS = 0 are 0.24, 0.22 and 0.31 percent
respectively. For Spice 2 the errors are 1.1, 0.67 and 0.78 percent. For the Brews
Charge-Sheet model we have 0.56, 2.5 and 0.94 percent (Table 7) with data at
V_GS = 2 included in the parameter extraction. For the two non-diffusive
models, Ihantola and Spice 2, we exclude evaluation at the lowest gate
voltage, since including this particular gate voltage in the parameter
estimation for these two models increases the RMS error beyond acceptable
accuracy (up to 80 and 90 percent). However, a comparison that excludes
V_GS = 2 for the non-diffusive models shows Ihantola to be somewhat more
accurate than Spice 2 for short channel devices also. For example, for devices
of dimensions 1.2 x 2.5 µm, 2.5 x 2.5 µm and 2.5 x 5 µm the respective
Ihantola/Spice 2 RMS MOSES errors are 3.9/4.8, 1.7/2.4 and 4.6/4.9 percent.

The consistency of MOSES improving upon the RMS error of SARAH (as
expected) in all three models is well established (with two unexplained
exceptions for the Brews model, see Chapter 4(c)). However, in some
instances MOSES fails to converge (to a particular P) within the set maximum
number of model evaluations (2000); instead, although improving upon SARAH's
result, the iterations exhibit a slow oscillatory type of behavior. When the
routine is interrupted after the oscillatory behavior is observed, the
resulting parameter estimate is well within acceptable accuracy and (as
noted) improves upon the SARAH estimate.

Also, for reasons not understood, eigenvalues of the Hessian 
(significant in ascertaining the sensitivity of a model to a perturbation 
of one or more of its parameter's components, see (b) below) frequently 
fail to be output in MOSES with no apparent regularity or consistency in 
all of the models tested. 

Intrinsic to the SARAH program are the variables NSIG and NSRCH.
NSRCH specifies the number of initial values or starting guesses from which
SARAH begins its search over the multi-dimensional parameter space for
the global minimum. In all three models a larger value of NSRCH produced
greater RMS accuracy. 100 was the typical value specified for NSRCH; a
test with NSRCH equal to 300, however, did not provide greater accuracy.
NSIG is a tolerance criterion that sets the number of significant figures
of each component of P which are required to coincide in two successive
iterations. In Ihantola, Spice 2 and Brews NSIG was set to 2 for SARAH
and 3 for MOSES. The improvement in SARAH accuracy gained from setting
NSIG=2 (from previous trials using NSIG=1) demanded in turn more CPU time.
This was true again for all three models incorporated into SARAH and MOSES
as subroutines.
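The roles of NSRCH and NSIG can be illustrated with a small Python sketch. It is a stand-in under stated assumptions, not the SARAH algorithm: the starting points are drawn at random inside the box (how SARAH distributes them is not described here), the local search is a plain fixed-step gradient descent, the NSIG test is interpreted as a relative change of at most 10^(-NSIG) in every component, and the objective `f` is a hypothetical function with two minima.

```python
import random

def agree_to_nsig(old, new, nsig):
    """True when every component changed by at most 10**-nsig in relative terms."""
    tol = 10.0 ** (-nsig)
    return all(abs(n - o) <= tol * max(abs(n), 1e-300) for o, n in zip(old, new))

def local_descent(grad, start, nsig, step=0.05, max_iter=2000):
    """Fixed-step gradient descent, stopped by the significant-figure test."""
    p = list(start)
    for _ in range(max_iter):
        g = grad(p)
        q = [pi - step * gi for pi, gi in zip(p, g)]
        if agree_to_nsig(p, q, nsig):
            return q
        p = q
    return p

def multistart(f, grad, bounds, nsrch, nsig, seed=0):
    """Descend from nsrch random points in the box and keep the deepest minimum."""
    rng = random.Random(seed)
    best = None
    for _ in range(nsrch):
        start = [rng.uniform(lo, hi) for lo, hi in bounds]
        p = local_descent(grad, start, nsig)
        if best is None or f(p) < f(best):
            best = p
    return best

# Hypothetical objective: minima near x = -1 and x = +1, the left one deeper.
def f(p):
    x, y = p
    return (x * x - 1.0) ** 2 + 0.3 * x + (y - 0.5) ** 2

def grad_f(p):
    x, y = p
    return [4.0 * x * (x * x - 1.0) + 0.3, 2.0 * (y - 0.5)]

best = multistart(f, grad_f, bounds=[(-2.0, 2.0), (-2.0, 2.0)], nsrch=20, nsig=2)
print("deepest minimum found:", [round(v, 3) for v in best], "S =", round(f(best), 4))
```

With a single start the search may settle in the shallower minimum near x = +1; with many starts it becomes very likely that at least one lands in the basin of the deeper minimum, which mirrors the observation that larger NSRCH improved accuracy up to a point. Raising `nsig` tightens the stopping test and costs more iterations.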

Table 9 compares accuracy (RMS percentage error) of the Ihantola and
Spice 2 models at different substrate biases (at V_BS = 0, Brews is included).
We can see that an increasingly negative substrate bias produces greater
error and that the Ihantola model performs generally better than the
Spice 2 model over all data sets at the V_BS values tested.


TABLE 9

Comparative RMS errors for Ihantola, Spice 2 and Brews models
at different substrate biases

              V_BS = 0              V_BS = -2        V_BS = -5
DATA      (I)     (S)     (B)      (I)     (S)      (I)     (S)

101       3.9     4.8     8.7
102       4.5     4.8     5.4
103       4.9     6.9     8.9
104       4.6     4.9     5.5
N12        .22     .67    2.5       .68    1.3      3.3     4.0
N13        .31     .78     .94      .92    1.5      5.4     5.8
AN11      1.9     7.4     5.4      4.2     4.3      12      16
AN12       .24    1.1      .56      .82    1.6      11      17
AN13      1.9     5.4     5.5      3.9     4.0      7.7     11
AN21      1.7     2.4     4.6      3.5     4.5      7.6     11
AN22      1.8     1.9     4.9      4.1     3.8      6.5     6.6
AN23      1.5      .92    3.6      2.3     7.1      4.0     6.2

(I) = IHANTOLA   (S) = SPICE 2   (B) = BREWS


(b) Sensitivity 

An analysis of sensitivity is possible by looking at the eigenvalues 
of the Hessian, the matrix of second partial derivatives, of the sum of 
squares function at an optimal parameter set. 

Let

$$ S(P) = \sum_{j=1}^{N} \left[ I_j^{*}(P) - I_j \right]^2 $$

be the sum of squares function, where $P = (P_1, \ldots, P_k)$ is the k-dimensional
parameter vector. Then a Taylor expansion of $S(P)$ gives

$$ S(P + \delta P) = S(P) + \nabla S(P) \cdot \delta P + \tfrac{1}{2} (\delta P)^T H(P)\, \delta P + o(|\delta P|^2) $$

where $H(P)$ is the k x k Hessian matrix, i.e. $H_{ij}(P) = \partial^2 S / \partial P_i \partial P_j$
is the (i,j) entry in the matrix. Now at a minimal point $P^*$,
$\nabla S(P^*) = 0$. (In our calculations $|\nabla S(P^*)| \sim 10^{-11}$, which is
much smaller than the typical largest eigenvalue of the Hessian, which was
approximately 10 to a power that is illegible in the source.) Using this result we find that

$$ S(P^* + \delta P) - S(P^*) = \tfrac{1}{2} (\delta P)^T H\, \delta P + o(|\delta P|^2) $$

Dividing both sides by $|\delta P|$, the norm of the perturbing term, we obtain
a formula for the relative change

$$ \frac{S(P^* + \delta P) - S(P^*)}{|\delta P|} = \frac{(\delta P)^T H\, \delta P}{2\,|\delta P|} + o(|\delta P|) \qquad (20) $$

This result is quite illuminating. The Hessian is first of all a
symmetric real matrix, so that its eigenvalues are all real and the
corresponding eigenvectors are orthogonal. Since $P^*$ is assumed to
be a minimum of $S(P)$, it follows that all the eigenvalues of $H(P^*)$
must be non-negative. If we further assume $P^*$ to be an isolated
minimum then the eigenvalues are strictly positive and $H(P^*)$ defines a positive
definite quadratic form. It then follows that, for fixed $|\delta P|$, the left hand side of (20)
takes on stationary values when $\delta P$ is an eigenvector, and these
values are $\tfrac{1}{2} \lambda_i |(\delta P)_i|$, where $(\delta P)_i$ is an i-th eigenvector
and $\lambda_i$ is its associated eigenvalue. Thus from equation (20), for
$|\delta P| = 1$ the stationary values occur at the normalized eigenvectors, with
value equal to half the eigenvalue for that eigenvector (the largest eigenvalue
giving the maximum). But a more
useful relationship is to see how (20) can be expressed when $\delta P$ is just
a change in a single parameter coordinate.

If we have the eigenvectors and eigenvalues of $H(P^*)$ this is a fairly
straightforward computation. Let $J$ denote the matrix whose i-th column is the
i-th normalized eigenvector of $H(P^*)$ and let $\Lambda$ denote the diagonal
matrix whose i-th diagonal element is the i-th eigenvalue. Then
$H(P^*) = J \Lambda J^T$. Let $\delta P^j(\varepsilon) = (0, \ldots, 0, \varepsilon, 0, \ldots, 0)$,
where $\varepsilon$ occupies the j-th position. Then, to leading order,

$$ Sn_j(\varepsilon) = \frac{S(P^* + \delta P^j(\varepsilon)) - S(P^*)}{|\delta P^j(\varepsilon)|} = \tfrac{1}{2} (\delta P^j(\varepsilon))^T J \Lambda J^T e_j , \qquad \text{where } e_j = \frac{\delta P^j(\varepsilon)}{|\varepsilon|}, \ |\varepsilon| \neq 0 . $$

Multiplying this right hand side out in detail gives

$$ Sn_j(\varepsilon) = \frac{|\varepsilon|}{2} \sum_{i=1}^{k} \lambda_i J_{ji}^2 . $$

So it is seen that the term $\sum_{i=1}^{k} \lambda_i J_{ji}^2$ measures the relative change of
the sum of squares function in the j-th coordinate direction.

We did experience some difficulty in obtaining the eigenvalues and 
eigenvectors of the Hessians in running MOSES with some data sets. The 
IMSL subroutine would run into a floating point overflow and the program 
would terminate. This occurred in all models. The precise reason for 
this was not found but it is suspected that the smallest eigenvalue may 
have been too small for the precision of the machine. Another possibility 
is that the condition number, which is the ratio of the largest to the 
smallest eigenvalue, is too large, resulting in numerical instability in 
the calculation of the eigenvectors and their associated eigenvalues. 
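As a small numerical illustration of the quantities discussed in this section, the Python sketch below computes the eigenvalues and eigenvectors of a symmetric 2 x 2 matrix in closed form, evaluates the per-coordinate term (the sum over i of lambda_i times J_ji squared) and forms the condition number. The matrix entries are hypothetical values chosen to be badly scaled; they are not a Hessian from any of the data sets, and the closed-form 2 x 2 routine stands in for the IMSL eigenvalue subroutine.

```python
import math

def eigh_2x2(a, b, c):
    """Eigenvalues (ascending) and unit eigenvectors of [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    lams = (mean - radius, mean + radius)
    if b == 0.0:
        # Already diagonal: the coordinate axes are the eigenvectors.
        vecs = [(1.0, 0.0), (0.0, 1.0)] if a <= c else [(0.0, 1.0), (1.0, 0.0)]
        return lams, vecs
    vecs = []
    for lam in lams:
        v = (lam - c, b)  # satisfies (H - lam I) v = 0 when b != 0
        norm = math.hypot(v[0], v[1])
        vecs.append((v[0] / norm, v[1] / norm))
    return lams, vecs

def coordinate_sensitivities(lams, vecs):
    """sum_i lambda_i * J_ji**2 for each coordinate j, with J_ji = vecs[i][j]."""
    k = len(lams)
    return [sum(lams[i] * vecs[i][j] ** 2 for i in range(k)) for j in range(k)]

# Hypothetical, badly scaled positive definite "Hessian" (illustrative values only).
a, b, c = 4.0, 0.02, 0.001
lams, vecs = eigh_2x2(a, b, c)
sens = coordinate_sensitivities(lams, vecs)
cond = lams[1] / lams[0]

print("eigenvalues:", lams)
print("per-coordinate terms:", sens)
print("condition number:", cond)
```

Since H = J Λ J^T, the per-coordinate term is simply the diagonal entry H_jj, which the sketch reproduces. The large condition number in this example shows how widely differing eigenvalues arise when one parameter direction is far less sensitive than another, the situation suspected above as a cause of the numerical difficulties.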

(c) Suggestions for future work 

As a result of the experience gained from the clinic's work on the 
parameter extraction process for the MOSFET device several directions in 
which further investigations could usefully be made become apparent. 

Some of these correspond to relatively simple variations in the conditions 
under which the programs are run: there was insufficient time for the 
team to include these in its work. Others involve more major excursions 
into different aspects of MOSFET modelling. We outline our views on 
these matters under separate headings below. 


It is known that the operation of a MOSFET depends critically upon
the nature of the inversion layer: upon how many carriers are in it,
upon their mobility, etc. Therefore for an accurate description of how
the MOSFET works we need the inversion layer carrier density per unit
area, N_e.

The effect of body-to-source reverse bias, V_BS, on N_e at fixed
gate-to-source bias is a reduction in qN_e. Therefore a larger gate bias
will be needed to cause inversion.

It does not seem that this feature of the MOSFET is properly represented
in the Ihantola and Spice 2 models, and this indicates why the
RMS error increases to unacceptable levels when they are used to fit data
with body-to-source bias of -2 and -5 (see Table 9).

It should be possible to include V_BS in these models in a more
realistic way, so that the accuracy of the parameter extraction process
will not be affected by necessary changes in V_BS.

(ii) Weighting and CLEAK 

The programs SARAH and MOSES possess the facility for giving more
weight to some data points relative to others in optimising the sum of
squares. In all cases run by the clinic equal weight was attached to all
data points. Similarly a parameter CLEAK is chosen for each operation
of the program, and is defined by

$$ d_j = \max\left( |I_j| ,\ \mathrm{CLEAK} \right), \qquad j = 1, \ldots, N, $$

where d_j appears in the sum of squares S*, w_j is the weighting factor
and N is the number of data points. Thus
by appropriately choosing CLEAK, S* can be made to denote a relative
sum of squares, an absolute sum of squares or a mixture of the two. The
choice of CLEAK in the clinic's work was such that d_j = 1.

Now it was found that much improvement of fit is obtained in the
Ihantola and Spice 2 models if the data for V_G = 2 are excluded. Further,
the fit tends to be worse for lower values of [symbol illegible]. It may be that
these problems could be tackled by varying the w_j so that low V_Gj
and [symbol illegible] values carry greater weight than others. Similar advantages may
be gained by altering the CLEAK parameter although we note, as mentioned
in Morris and Everson (1984), that at very low currents it is more
appropriate to use the absolute sum of squares to avoid very large errors. It
is recommended that this area be investigated.
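The effect of CLEAK and of the weights can be seen in a short Python sketch. The exact form of the sum of squares used in SARAH and MOSES is not reproduced in this section, so the form below, S = sum over j of w_j ((I_model_j - I_j) / d_j)^2 with d_j = max(|I_j|, CLEAK), is an assumption, as are the function name and the illustrative currents.

```python
def weighted_sum_of_squares(model_currents, measured_currents, weights, cleak):
    """Assumed form: sum_j w_j * ((I_model_j - I_j) / d_j)**2, d_j = max(|I_j|, CLEAK)."""
    total = 0.0
    for i_model, i_meas, w in zip(model_currents, measured_currents, weights):
        d = max(abs(i_meas), cleak)
        total += w * ((i_model - i_meas) / d) ** 2
    return total

# Hypothetical currents (amperes) spanning three decades; the model is 10% high everywhere.
measured = [1.0e-6, 1.0e-5, 1.0e-4, 1.0e-3]
model = [1.1 * i for i in measured]
equal_weights = [1.0] * len(measured)

# CLEAK below every |I_j|: d_j = |I_j|, a relative sum of squares.
relative = weighted_sum_of_squares(model, measured, equal_weights, cleak=0.0)
# CLEAK above every |I_j|: d_j = CLEAK = 1, an absolute sum of squares.
absolute = weighted_sum_of_squares(model, measured, equal_weights, cleak=1.0)
# Extra weight on the two lowest-current points.
low_weighted = weighted_sum_of_squares(model, measured, [10.0, 10.0, 1.0, 1.0], cleak=0.0)

print("relative:", relative)
print("absolute:", absolute)
print("relative, low currents weighted x10:", low_weighted)
```

In the relative form every point contributes equally to S, whereas in the absolute form the largest current dominates and the low-current points are almost invisible to the optimizer. Increasing w_j for the low-current points is one way of shifting the balance, which is the kind of adjustment suggested above.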

(iii) Role of Constraints 

The constraints on the parameters, imposed in the SARAH program, can be
adjusted and the chosen limits can have an important effect on the running
of the program. Over-relaxed constraints can produce estimated parameters
from SARAH which make the results from the MOSES program less predictable.
In extreme cases, the model evaluation may fail due, for example, to
square roots of negative quantities occurring. Such modifications are associated with the
choice of the parameter NSRCH in SARAH (number of starting points) and a
systematic study of the effect of variation of constraints and NSRCH
would be of interest. In fact, NSRCH is the only control that one has
in finding the global minimum, and repetition of results for two values
of this parameter (differing say by a factor of two) was taken as satisfactory
evidence that the global minimum had in fact been attained in the search
area. This procedure was only carried out in a few cases, however.

The related question of choosing 'realistic' values of parameters is
also important. For example, P6 and P7 in the Spice 2 model (length
and width parameters) were constrained to be within ±20% of the device
measurements and the resulting predictions led MOSES to an improved fit
of the model to the data. In other cases the choice is much less clear-cut,
and the parameter V_FB seemed particularly difficult to estimate.

(iv) Non-convergence

It was found on a number of data sets that Ihantola and Spice 2 
would not converge using the MOSES program. What seemed to occur was 
that the algorithm was causing certain parameters to oscillate slowly 
around a closed orbit. It would be worthwhile to obtain a clear analysis 
of the weakness of the Levenberg-Marquardt algorithm in its application to 
this problem. 

(v) Other models 

The clinic concentrated on three one-dimensional models of the more
straightforward type. No study was made of more elaborate models such as
the Pao-Sah model or possibly two-dimensional models. These are
computationally expensive models but are expected to be more accurate. An
analysis of these models might well pay dividends in a more complete
understanding of the device behaviour as well as showing more explicitly
the limitations of the one-dimensional models which were studied.

(vi) Inverse theory 

There is a branch of mathematics that is concerned with the analysis 
of parameter estimation problems (as well as inverse eigenvalue problems, 
inverse scattering problems and many others). Our problem, of MOSFET 
parameter estimation, has its most general formulation in terms of this 
theory. It is possible that framing the problem within such a formulation 
could produce some practical results. From preliminary literature searches 
no publications were located on this topic in spite of the fact that 
numerous applications of this theory in other engineering disciplines 
have met with some success. An attempt to accomplish this analysis will
be made by a member of the clinic over the summer of 1985 and results,
if any, will be made available to those who might be interested.

ERRATUM


The results obtained in the Brews model (Model 6) assumed an incorrect
dependence on the parameter V_FB. Necessary corrections amount to replacing
V_GS by V_GS - V_FB everywhere after label 501 in SUBROUTINE BREWS (nine
substitutions).



